65 research outputs found

    Improving novelty detection using the reconstructions of nearest neighbours

    We show that using nearest neighbours in the latent space of autoencoders (AE) significantly improves the performance of semi-supervised novelty detection in both single- and multi-class contexts. Autoencoding methods detect novelty by learning to differentiate between the non-novel training class(es) and all other unseen classes. Our method combines the reconstructions of the nearest neighbours with the latent-neighbour distances of a given input's latent representation. We demonstrate that our nearest-latent-neighbours (NLN) algorithm is memory and time efficient, requires no significant data augmentation, and does not rely on pre-trained networks. Furthermore, we show that the NLN algorithm is easily applicable to multiple datasets without modification. Additionally, the proposed algorithm is agnostic to autoencoder architecture and reconstruction error method. We validate our method across several standard datasets for a variety of autoencoding architectures, such as vanilla, adversarial and variational autoencoders, using either reconstruction, residual or feature-consistent losses. The results show that the NLN algorithm grants up to a 17% increase in Area Under the Receiver Operating Characteristic curve (AUROC) performance for the multi-class case and 8% for single-class novelty detection.
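
    The scoring idea described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `nln_scores`, the weighting parameter `alpha`, the mean-absolute reconstruction error, and the toy linear `encode`/`decode` pair are all assumptions introduced here. The abstract only states that neighbour reconstructions and latent-neighbour distances are combined.

```python
import numpy as np

def nln_scores(x_test, z_train, encode, decode, k=3, alpha=0.5):
    """Novelty score per test sample (higher = more novel).

    Hypothetical combination rule: alpha * reconstruction error against the
    decoded k nearest training latents + (1 - alpha) * mean latent distance.
    """
    z_test = encode(x_test)                                   # (n_test, d_latent)
    # Pairwise Euclidean distances between test and training latents.
    dists = np.linalg.norm(z_test[:, None, :] - z_train[None, :, :], axis=-1)
    nn_idx = np.argsort(dists, axis=1)[:, :k]                 # (n_test, k)
    nn_dist = np.take_along_axis(dists, nn_idx, axis=1)       # (n_test, k)
    # Decode the neighbours' latents and compare them with the input itself.
    recon = decode(z_train[nn_idx.reshape(-1)]).reshape(len(x_test), k, -1)
    recon_err = np.abs(x_test[:, None, :] - recon).mean(axis=(1, 2))
    latent_err = nn_dist.mean(axis=1)
    return alpha * recon_err + (1 - alpha) * latent_err

# Toy "autoencoder" (assumption): keep the first two of four dimensions.
W = np.eye(4)[:, :2]
encode = lambda x: x @ W
decode = lambda z: z @ W.T

# Non-novel training data lies on a small grid inside the 2-D subspace.
g = np.linspace(-1.0, 1.0, 5)
z_train = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)

x_test = np.array([
    [0.1, 0.0, 0.0, 0.0],    # inlier
    [0.1, 0.0, 5.0, 5.0],    # novel: energy outside the learned subspace
    [10.0, 10.0, 0.0, 0.0],  # novel: far from every training latent
])
scores = nln_scores(x_test, z_train, encode, decode)
```

    In this toy setting the second sample shares its latent neighbours with the inlier and is separated only by the reconstruction term, while the third is separated mainly by the latent-distance term, which is why a combination of the two can help.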

    Learning to detect radio frequency interference in radio astronomy without seeing it

    Radio Frequency Interference (RFI) corrupts astronomical measurements, thus affecting the performance of radio telescopes. To address this problem, supervised segmentation models have been proposed as candidate solutions to RFI detection. However, the unavailability of large labelled datasets, due to the prohibitive cost of annotation, makes these solutions unusable. To address these shortcomings, we focus on the inverse problem: training models on only uncontaminated emissions, thereby learning to discriminate RFI from all known astronomical signals and system noise. We use Nearest-Latent-Neighbours (NLN), an algorithm that utilises both the reconstructions and latent distances to the nearest neighbours in the latent space of generative autoencoding models for novelty detection. The uncontaminated regions are selected using weak labels in the form of RFI flags (generated by classical RFI flagging methods), which are available from most radio astronomical data archives at no additional cost. We evaluate performance on two independent datasets, one simulated from the HERA telescope and another consisting of real observations from the LOFAR telescope. Additionally, we provide a small expert-labelled LOFAR dataset (i.e., strong labels) for evaluation of our and other methods. Performance is measured using AUROC, AUPRC and the maximum F1-score for a fixed threshold. On the simulated HERA dataset, we outperform the current state of the art by approximately 1% in AUROC and 3% in AUPRC. Furthermore, on the LOFAR dataset our algorithm offers a 4% increase in both AUROC and AUPRC, at the cost of a degradation in F1-score, without any manual labelling.
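
    The weak-label selection step described in this abstract might look roughly like the sketch below. It is an illustration under stated assumptions, not the paper's pipeline: the function name `clean_patches`, the non-overlapping square patching, and the rule "discard a patch if any pixel in it is flagged" are choices made here for clarity.

```python
import numpy as np

def clean_patches(spectrogram, flags, patch=4):
    """Split a time-frequency array into non-overlapping square patches and
    keep only those whose weak-label flag mask contains no flagged pixel.

    Returns (kept_patches, keep_mask); keep_mask has one entry per patch,
    in row-major patch order.
    """
    t, f = spectrogram.shape
    # Crop so both axes divide evenly into patches.
    t_crop, f_crop = t - t % patch, f - f % patch

    def to_patches(a):
        a = a[:t_crop, :f_crop]
        return (a.reshape(t_crop // patch, patch, f_crop // patch, patch)
                 .swapaxes(1, 2)
                 .reshape(-1, patch, patch))

    data_p = to_patches(spectrogram)
    flag_p = to_patches(flags)
    keep = ~flag_p.any(axis=(1, 2))   # True where the patch is flag-free
    return data_p[keep], keep

# Toy 8x8 "spectrogram" with one flagged pixel in the top-right patch.
spectrogram = np.arange(64, dtype=float).reshape(8, 8)
flags = np.zeros((8, 8), dtype=bool)
flags[1, 5] = True

train_patches, keep = clean_patches(spectrogram, flags)
```

    Only the flag-free patches would then be used to train the autoencoder, so the model never sees data that the classical flagger marked as RFI.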

    Towards exascale real-time RFI mitigation

    We describe the design and implementation of an extremely scalable real-time RFI mitigation method, based on the offline AOFlagger. All algorithms scale linearly in the number of samples. We describe how we implemented the flagger in the LOFAR real-time pipeline, on both CPUs and GPUs. Additionally, we introduce a novel, simple history-based flagger that helps reduce the impact of our small window on the data. By examining an observation of a known pulsar, we demonstrate that our flagger can achieve much higher quality than a simple thresholder, even when running in real time on a distributed system. The flagger works on visibility data, but also on raw voltages and beamformed data. The algorithms are scale-invariant and work on microsecond to second time scales. We are currently implementing a prototype for the time-domain pipeline of the SKA central signal processor.